19 research outputs found

    Tile2Vec: Unsupervised representation learning for spatially distributed data

    Geospatial analysis lacks methods like the word vector representations and pre-trained networks that significantly boost performance across a wide range of natural language and computer vision tasks. To fill this gap, we introduce Tile2Vec, an unsupervised representation learning algorithm that extends the distributional hypothesis from natural language -- words appearing in similar contexts tend to have similar meanings -- to spatially distributed data. We demonstrate empirically that Tile2Vec learns semantically meaningful representations on three datasets. Our learned representations significantly improve performance in downstream classification tasks and, similar to word vectors, visual analogies can be obtained via simple arithmetic in the latent space. Comment: 8 pages, 4 figures in main text; 9 pages, 11 figures in appendix

    Understanding the Requirements for Surveys to Support Satellite-Based Crop Type Mapping: Evidence from Sub-Saharan Africa

    This paper provides recommendations on how large-scale household surveys should be conducted to generate the data needed to train models for satellite-based crop type mapping in smallholder farming systems. The analysis focuses on maize cultivation in Malawi and Ethiopia, and leverages rich, georeferenced plot-level data from national household surveys that were conducted in 2018–20 and integrated with Sentinel-2 satellite imagery and complementary geospatial data. To identify the approach to survey data collection that yields optimal data for training remote sensing models, 26,250 in silico experiments are simulated within a machine learning framework. The best model is then applied to map seasonal maize cultivation from 2016 to 2019 at 10-m resolution in both countries. The analysis reveals that smallholder plots with maize cultivation can be identified with up to 75% accuracy. Collecting full plot boundaries or complete plot corner points provides the best quality of information for model training. Classification performance peaks with slightly less than 60% of the training data. Seemingly little erosion in accuracy under less preferable approaches to georeferencing plots results in the total area under maize cultivation being overestimated by 0.16–0.47 million hectares (8–24%) in Malawi

    Corrigendum: Satellite detection of cover crops and their effects on crop yield in the Midwestern United States (2018 Environ. Res. Lett. 13 064033)

    The original raw dataset used to generate this work contained a number of duplicate entries—roughly 7% of the total farm fields. The substantive majority of these were from one large farm that had conducted their operations in a way that caused duplication as a side effect in our data generation process. Unfortunately, as the error was in the raw dataset, its correction required a re-run of the entire data pipeline, resulting in numerous small downstream changes. With respect to the most important numbers, the accuracy of the classifier went down slightly from 91.5% to 91.2% measured in absolute terms but increased from 0.68 to 0.74 measured by kappa. The trend in cover cropped acres grew slightly stronger, and the yield effects in maize and soybean moved from 0.65% to 0.71% and 0.35% to 0.29% respectively. None of the overall conclusions of the work have materially changed. Below, we provide all changes to the applicable sections of the original manuscript in bold underscore (or strikethrough) where applicable, in addition to modified versions of the corresponding figures and supplementary materials

    Mapping Smallholder Yield Heterogeneity at Multiple Scales in Eastern Africa

    Accurate measurements of crop production in smallholder farming systems are critical to the understanding of yield constraints and, thus, setting the appropriate agronomic investments and policies for improving food security and reducing poverty. Nevertheless, mapping the yields of smallholder farms is challenging because of factors such as small field sizes and heterogeneous landscapes. Recent advances in fine-resolution satellite sensors offer promise for monitoring and characterizing the production of smallholder farms. In this study, we investigated the utility of different sensors, including the commercial SkySat and RapidEye satellites and the publicly accessible Sentinel-2, for tracking smallholder maize yield variation throughout a ~40,000 km² western Kenya region. We tested the potential of two types of multiple regression models for predicting yield: (i) a “calibrated model”, which required ground-measured yield and weather data for calibration, and (ii) an “uncalibrated model”, which used a process-based crop model to generate daily vegetation index and end-of-season biomass and/or yield as pseudo training samples. Model performance was evaluated at the field, division, and district scales using a combination of farmer surveys and crop cuts across thousands of smallholder plots in western Kenya. Results show that the “calibrated” approach captured a significant fraction (R² between 0.3 and 0.6) of yield variations at aggregated administrative units (e.g., districts and divisions), while the “uncalibrated” approach performed only slightly worse. For both approaches, we found that predictions using the MERIS Terrestrial Chlorophyll Index (MTCI), which included the red edge band available in RapidEye and Sentinel-2, were superior to those made using other commonly used vegetation indices. We also found that multiple refinements to the crop simulation procedures led to improvements in the “uncalibrated” approach. We identified the prevalence of small field sizes, intercropping management, and cloudy satellite images as major challenges to improving model performance. Overall, this study suggested that high-resolution satellite imagery can be used to map yields of smallholder farming systems, and the methodology presented in this study could serve as a good foundation for other smallholder farming systems in the world.